1. INTRODUCTION

Information and communication technologies (ICT) have developed significantly, especially in terms of
hardware miniaturization, cost reduction and energy consumption optimization. This progress enables the
interconnection of a large number of physical objects over the Internet, forming what is called the
internet of things (IoT). The IoT provides the opportunity to interact with these objects through sensors,
actuators and smart applications, which may help users in several areas such as transport, logistics,
health care and agriculture [1].

In the IoT, formerly static objects become intelligent and able to share information and communicate
with other devices autonomously [2]. Many hardware and software elements are used to run IoT technology,
such as sensors, GPS, cameras and applications [3]. IoT devices are deployed in different areas such as
e-tracking, e-commerce, e-home and e-health. As a result, IoT technology has been a research focus over
the last ten years [3]. These devices produce a large quantity of heterogeneous data, and their state
changes very quickly.

The Internet is a popular global information system where users can find relevant information using
search engines (SE). An SE is a type of software that organizes content collected from the resources
available on the Internet. With an SE, users who wish to find information only need to enter a keyword
describing what they would like to see, and the search engine presents links to the content that matches
what they need [4]. Searching in IoT networks has a different goal from that of typical search engines,
because users may want to operate the objects locally or remotely. This difference calls for a new design
concept for an IoT search engine, which is not simple because new techniques for crawling, indexing,
storing and querying must be designed [5].

Many IoT search engines [6]-[9] are designed to allow the search, retrieval and identification of
connected things. Shodan.io is a search engine designed by programmer John Matherly in 2009. It
interrogates device ports and grabs the resulting banners, then indexes the corresponding public IP
addresses and stores them in an internal database for future lookups [10]. Another popular IoT search
engine is Censys, which collects all the data it can about devices connected over IPv4. It uses the
open-source port scanner ZMap together with ZGrab and stores everything it retrieves in a database, which
is then accessible via a web interface, an API or downloadable plain-text listings [11]. Thingful can be
described as a “discoverable search engine” that gives its users a geographic index of connected objects
around the world (https://thingful.net/). Thingful states that it can index across multiple IoT networks
and infrastructures, because it can locate the geographical position of objects and devices [12].
However, these search engines do not fully meet users' needs, because device state changes quickly and
their results are complex. This calls for a new mechanism for IoT device retrieval that addresses issues
such as real-time retrieval, fast response and accurate results.

This work proposes an online tool for the real-time retrieval of connected things worldwide, and of
descriptive information related to these devices, based on the network port scanning technique. Section 2
introduces the basic concepts related to the development of the proposed tool. Section 3 presents the
requirement specification and the proposed tool. Section 4 presents and discusses the results. Finally,
Section 5 gives the conclusion and future improvements.


2. BACKGROUND 
2.1. Internet of things and connected things 

IoT represents a giant infrastructure that enables machine-to-machine communication and the remote
monitoring and control of objects/devices in many fields and applications such as industry, agriculture,
healthcare and education. It is a network of connected things that are able to gather and share
information about the way they are used and about the environment around them. IoT has been the main
focus of many research works [13]-[18] in recent years.

Connected things refer to smart devices: autonomous electronic devices that may be connected with each
other in a network, mobile devices, and computing devices that are typically small enough to be
handheld. These things are connected through various wired and wireless networks and protocols (Wi-Fi, 3G
and 4G networks, etc.), and are usually monitored and controlled remotely. They are commonly embedded
with technologies such as processing chips, software and sensors.

Things, in the IoT sense, can refer to a wide variety of devices such as heart monitoring implants,
biochip transponders on farm animals, cameras streaming live feeds of wild animals in coastal waters,
automobiles with built-in sensors, DNA analysis devices for environmental, food and pathogen monitoring,
or field operation devices that assist firefighters in search and rescue operations. Legal scholars
suggest regarding “things” as an “inextricable mixture of hardware, software, data and service” [19].


2.2. Searching 

Websites that index and classify other websites according to their keywords, descriptions and contents,
making it easier and faster to reach relevant results, are called search engines [20]. Since their
appearance in the 1990s, they have enjoyed great success and have changed the way information is
retrieved. A search engine is a tool based on a set of algorithms that allows its users to search and
access a huge amount of web information easily and to obtain well-organized results. These engines have
become smarter through the integration of new methods such as machine learning for result classification
and query interpretation.

IoT has a set of special features that present great challenges for traditional search engines. To
respond to these issues and continue the success of search engines as a large number of IoT devices join
the Web every day, a new evolution of these tools has appeared, called IoT search engines [21]-[23].
These are search tools able to find connected devices and information about them, and they also help to
address a set of IoT issues.

With the emergence of the internet of things, challenges relating to network security, device
management, device status, access control and anomaly detection lead managers and administrators of the
IoT infrastructure to design and develop new supporting mechanisms. IoT search engines can alleviate
these challenges [24]-[26] because they can identify devices and services connected to the Internet,
including vulnerable devices, and they also allow users to learn and search for information about the
IoT.


2.3. Network port scanning 

Network scanning is a technique that scans network ports as a form of vulnerability analysis. It is
commonly used for security assessment and system maintenance. It is also one of the main ways attackers
gather information about their targets.

Network scanning consists of network port scanning as well as vulnerability scanning [27]. Network
port scanning refers to sending data packets over the network to specific service port numbers of a
computing system (for example, port 23 for Telnet, port 80 for HTTP, and so on). The aim is to identify
the network services available on that particular system. Network port scanning, like vulnerability
scanning, is an information-gathering technique. However, when carried out by anonymous individuals, it
is viewed as a prelude to an attack. Network scanning processes, such as port scans and ping sweeps,
return details about which IP addresses map to active live hosts and the type of services they provide
[28]. Scanning can be done easily with available tools such as nmap [29], Angry IP Scanner [30] and
Advanced Port Scanner [31].
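As an illustration of the port scanning idea described above, the following minimal sketch performs a plain TCP connect scan with Python's standard `socket` module. It is not the scanning method of any of the cited tools. The function name `scan_ports` and the timeout value are assumptions made for this example, and it should only be run against hosts one is authorized to test.

```python
import socket


def scan_ports(host, ports, timeout=0.5):
    """Return a dict mapping each port to True (open) or False (closed or filtered)."""
    results = {}
    for port in ports:
        # A new TCP socket is used for every probe and closed right after it.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 when the TCP handshake succeeds,
            # and an error code otherwise (refused, timed out, ...).
            results[port] = sock.connect_ex((host, port)) == 0
    return results
```

For example, `scan_ports("127.0.0.1", [23, 80])` would report whether Telnet and HTTP ports accept connections on the local machine.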


2.4. Multiprocessing 

Multiprocessing, or parallel processing, is a type of processing that runs a set of tasks
simultaneously on multiple processors, as illustrated in Figure 1. It aims to get more work done in a
shorter period of time and to reduce overall processing time compared with serial processing. It is
typically used when very high speed is required to process a large volume of data. Multiprocessing
distributes a large, complex task into multiple smaller calculations, where each sub-process has a
dedicated CPU and memory slot. It refers to the ability of a system to support more than one processor
working at the same time and independently.





[Figure 1 diagram: Task 1, Task 2 and Task 3 run side by side, each with its own CPU and memory]

Figure 1. Multiprocessing 
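
The idea of distributing a larger task into smaller calculations can be sketched with Python's standard `multiprocessing` module. This is a minimal illustration, not code from the proposed tool: the helper names `split_into_chunks` and `parallel_sum` are hypothetical, and the preference for the "fork" start method is only a convenience assumption for this sketch.

```python
import multiprocessing as mp


def split_into_chunks(values, n_chunks):
    """Split a list into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks receive one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def parallel_sum(values, workers=3):
    """Sum a list by giving each worker process one chunk, then adding the partial sums."""
    # Assumption: use "fork" where the platform offers it, else the default start method.
    method = "fork" if "fork" in mp.get_all_start_methods() else None
    ctx = mp.get_context(method)
    with ctx.Pool(processes=workers) as pool:
        # Each chunk is summed in a separate process by the built-in sum().
        partial_sums = pool.map(sum, split_into_chunks(values, workers))
    return sum(partial_sums)
```

Each worker handles one smaller calculation independently, and the parent process only combines the partial results, which mirrors the three-task layout of Figure 1.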


Multiprocessing can be used to improve existing versions of proposed solutions by reducing processing
time, as in the work of Li et al. [32]. They developed an efficient guide RNA library design tool called
MultiGuideScan, a multiprocessing version of the GuideScan software (developed to design CRISPR guide RNA
libraries, which can be used for effective genome editing of coding and noncoding genomic regions [32]).
Their experiments show that, using 32 processes, MultiGuideScan speeds up the design of guide RNA
libraries by about 9 to 12 times compared with the original GuideScan.


3. RESEARCH METHOD 

The main idea of this work is to propose a retrieval tool that provides users with all currently
available information about each requested thing with the minimum possible delay. Information about the
requested devices is collected in real time using the network port scanning technique, specifically with
the python-nmap library. The data are collected from a set of scans, where each scan is responsible for
retrieving a specific set of information and can take a significant amount of time. To reduce data
collection time, we designed parallel scans that run all scans at the same time and as fast as possible.
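
To make the notion of a single scan more concrete, the following sketch shows how one scan could be run through python-nmap and reduced to a few fields. It is an illustration under stated assumptions, not the tool's actual code: `run_scan` and `summarize_scan` are hypothetical helper names, the `-sV` argument is just one possible nmap option, and the dictionary layout (`scan`, `status`, `hostnames`, `tcp`) is assumed to match what `PortScanner.scan()` returns.

```python
def run_scan(host, arguments="-sV"):
    """Run one nmap scan through python-nmap and return its raw result dictionary."""
    # Third-party package python-nmap; it also needs the nmap binary installed.
    import nmap
    return nmap.PortScanner().scan(hosts=host, arguments=arguments)


def summarize_scan(scan_result, host):
    """Extract host state, hostnames and open ports from a python-nmap style result."""
    host_data = scan_result.get("scan", {}).get(host, {})
    open_ports = []
    for protocol in ("tcp", "udp"):
        for port, info in sorted(host_data.get(protocol, {}).items()):
            if info.get("state") == "open":
                open_ports.append(
                    {"port": port, "protocol": protocol, "service": info.get("name", "")}
                )
    return {
        "state": host_data.get("status", {}).get("state", "unknown"),
        "hostnames": [h.get("name", "") for h in host_data.get("hostnames", [])],
        "open_ports": open_ports,
    }
```

A call such as `summarize_scan(run_scan("192.0.2.10"), "192.0.2.10")` would then give a compact view of one device. Note that results are assumed to be keyed by IP address, so a hostname query may need to be resolved first.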


3.1. Software requirement specification 


Requirement specification is the first step in developing a tool or application. This sub-section
presents the essential requirements for the development of our solution: a) provide a simple and
easy-to-use GUI, b) provide accurate and understandable results, c) provide as much of the available
information related to connected things as possible, d) allow users to perform real-time retrieval,
e) require no technical knowledge from users, f) be free to use (no registration required), g) allow an
unlimited number of searches, h) give full access to results.


3.2. Proposed algorithm 
The flowchart of the proposed algorithm is shown in Figure 2. It includes the following steps:
a. Send the user query to the server:
   The query specifies the target host, which can be given as an IP address or a hostname
b. Launch a set of scans in parallel to find information about the requested device:
   Each scan is responsible for collecting specific data
   Scans that take more time are divided into sub-scans whenever possible; otherwise, similar scans
   are launched in parallel
c. Collect the results generated by the performed scans:
   Case 1: collect the results from all scans and combine them, then move on to the next step
   Case 2: collect the results of each scan separately, then move on to the next step
d. Extract the relevant information to be shown from the collected data and send it to the user:
   Case 1: extract information from the combined results of all scans
   Case 2: extract information from the results of each scan as they are received separately
e. Display the search results in an ergonomic and understandable way on the system interface
f. Because the information changes dynamically, all scans are relaunched at a specified interval and
   the content of the page is automatically refreshed for as long as the user stays on the results
   interface.
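
Steps b and c above can be sketched as follows. This is a minimal illustration rather than the tool's implementation: the three scan functions are hypothetical stand-ins that sleep instead of scanning, and a thread pool from `concurrent.futures` is used here in place of whatever parallel mechanism the tool relies on, which is a reasonable substitute when each scan mostly waits on the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


# Hypothetical stand-ins for real scans; each returns one piece of information.
def fake_state_scan(host):
    time.sleep(0.05)
    return {"state": "up"}


def fake_port_scan(host):
    time.sleep(0.10)
    return {"ports": [80, 443]}


def fake_os_scan(host):
    time.sleep(0.15)
    return {"os": "unknown"}


SCANS = {"state": fake_state_scan, "ports": fake_port_scan, "os": fake_os_scan}


def collect_combined(host, scans=SCANS):
    """Case 1: wait for every scan, then return one merged result."""
    combined = {}
    with ThreadPoolExecutor(max_workers=len(scans)) as pool:
        futures = [pool.submit(scan, host) for scan in scans.values()]
        for future in futures:
            combined.update(future.result())
    return combined


def collect_separately(host, scans=SCANS):
    """Case 2: yield (scan name, result) as soon as each scan finishes."""
    with ThreadPoolExecutor(max_workers=len(scans)) as pool:
        futures = {pool.submit(scan, host): name for name, scan in scans.items()}
        for future in as_completed(futures):
            yield futures[future], future.result()
```

Case 1 suits a results page that is rendered once, while Case 2 lets the interface show each piece of information as soon as it is available, at the cost of handling partial results.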


Figure 2. Proposed algorithm for real-time retrieval


4. RESULTS AND DISCUSSION
The proposed retrieval tool is designed and developed as a web application, which makes it easy to
use, requires no prior installation and guarantees that users always run the latest version. The web
application was developed using Flask, an open-source micro framework for web development in Python
[33], together with other technologies and Python libraries such as python-nmap [34]. The application is
based on two main interfaces:
- The first user interface, shown in Figure 3, offers a simple and ergonomic way for users to retrieve
current data related to connected things.
- The results interface is displayed shortly after the user's request. It shows the current information
related to state, ports, protocols, OS, device type, hostnames and addresses, as shown in Figure 4 and
Figure 5, and offers a clear and useful visualization of all available data collected in real time.





Figure 3. Retrieval GUI

Figure 4. Results GUI: Part 1

Figure 5. Results GUI: Part 2


5. CONCLUSION

This work has resulted in the design of a new tool for the real-time retrieval of connected things.
The main objective of this tool is to allow real-time, online retrieval of connected devices using
network port scanning, which collects data related to these things in real time. The important
information is extracted from the collected data and presented so that it is understandable to all
users. In future work, we will attempt to improve the results and evaluate the performance of the
proposed tool. To this end, we will perform a set of tests related to parallel retrieval and response
time, use other resources for data collection and improve security.